In [1]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

QR code and shape segmentation and detection with OpenCV

Signals and Systems Course Project II - 13982

Dept. of Electrical Engineering, Sharif University of Tech.

In this project, our goal is to perform semantic segmentation with the OpenCV library, then use additional tools to detect QR codes and the sizes of shapes in a video stream.

NOTE: this instruction file was created with the help of various library documents. We respect open-source library access, especially the official OpenCV and NumPy docs and the Stack Overflow/GitHub/LearnOpenCV forums.

https://docs.opencv.org/master/d6/d00/tutorial_py_root.html

https://docs.opencv.org/master/d7/dbd/group__imgproc.html

https://www.ccoderun.ca/programming/doxygen/opencv/tutorial_root.html

Part I : Basics of image

In this part, you should write a report with a Jupyter notebook about image filters, geometric transformations, and video processing, with examples. Read the instructions carefully.

Here we have introduced some examples in order to warm you up :)

Q. Object oriented approach

Once again, an object-oriented approach is one of the goals of this project!

Implement all of these functions in a class called ImageSignal; it should look like the cell below. Note that you may use other classes if you think they enhance your implementation.

In [ ]:
class ImageSignal:
    def __init__(self):
        self.property1 = []  # initializer
        ### TODO

    def method_name(self, inputs):
        # TODO
        pass

    @staticmethod
    def static_method_name(inputs):
        # TODO
        pass

Q. Load, save and display an image

In this cell, try to load an image file, display it, resize it, and then save it.

Load

In [34]:
imgpath = 'images/PGN1.jpg'
img = cv2.imread(imgpath)

Show with cv2

In [9]:
cv2.imshow('Pagani Zonda F', img)
cv2.waitKey(0)
cv2.destroyAllWindows()  # close the window after a key press

Show with matplotlib

In [36]:
img_ = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(20, 30))
plt.imshow(img_)
plt.axis(False)
plt.show()

Resize

In [37]:
scale_ = 40
width_ = int(img.shape[1]*scale_/100)
height_ = int(img.shape[0]*scale_/100)
dim_ = (width_, height_)
resized_img = cv2.resize(img, dim_, interpolation=cv2.INTER_AREA)
cv2.imshow('Pagani resized', resized_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
print(img.shape, resized_img.shape)
(2160, 3840, 3) (864, 1536, 3)

Save

In [38]:
imgpath_ = 'images/PGN2.jpg'
cv2.imwrite(imgpath_, resized_img)
Out[38]:
True

Q. Basic filtering with custom kernels

The first operation in image processing is filtering. You might expect us to take the FFT of the image, but no :) there is a theorem that keeps our work simple:

Theorem I: The Fourier transform of a Gaussian function in N-dimensional space is another Gaussian with a different $\sigma$.

This means that, instead of taking the FFT and filtering in the frequency domain, we can convolve filter kernels with the image directly in RGB space. Kernels are f×f matrices that represent the filter.
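To make this concrete, here is a small NumPy sanity check (a synthetic random image and circular convolution): filtering in the spatial domain matches multiplying by the kernel's FFT in the frequency domain. All sizes and the σ value below are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))

# Small 2-D Gaussian kernel from the outer product of a 1-D Gaussian.
x = np.arange(-3, 4)
g1d = np.exp(-x**2 / (2 * 1.5**2))
g2d = np.outer(g1d, g1d)
g2d /= g2d.sum()

# Zero-pad the kernel to the image size and center it at the origin,
# so the FFT product corresponds to circular convolution.
kernel = np.zeros_like(img)
kernel[:7, :7] = g2d
kernel = np.roll(kernel, (-3, -3), axis=(0, 1))

# Frequency domain: FFT, pointwise multiply, inverse FFT.
filtered_fft = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel)))

# Spatial domain: direct circular convolution via shifted copies.
filtered_spatial = np.zeros_like(img)
for dy in range(-3, 4):
    for dx in range(-3, 4):
        filtered_spatial += g2d[dy + 3, dx + 3] * np.roll(img, (dy, dx), axis=(0, 1))

# The difference is at floating-point noise level.
print(np.max(np.abs(filtered_fft - filtered_spatial)))
```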

Average filtering

Include two examples of average filtering on arbitrary images in your report.

In [40]:
img = cv2.cvtColor(cv2.imread('images/LA1.jpg'), cv2.COLOR_BGR2RGB)  # convert BGR to RGB for matplotlib
kernel = np.ones((7, 7), np.float32)/49

avg = cv2.filter2D(img, -1, kernel)
plt.figure(figsize=(30, 30))

plt.subplot(211)
plt.imshow(img)
plt.title('Original')
plt.xticks([])
plt.yticks([])

plt.subplot(212)
plt.imshow(avg)
plt.title('Averaging')
plt.xticks([])
plt.yticks([])

plt.show()

Gaussian filtering

In the final report, include some examples of Gaussian filtering on images corrupted by Gaussian noise.

In [29]:
img = cv2.cvtColor(cv2.imread('images/LA1.jpg'), cv2.COLOR_BGR2RGB)  # convert BGR to RGB for matplotlib
kernel10 = cv2.getGaussianKernel(ksize=7, sigma=10)
kernel1 = cv2.getGaussianKernel(ksize=7, sigma=1)

# getGaussianKernel returns a 1-D column vector; take the outer product
# to get a 2-D kernel that blurs in both directions.
gsn1 = cv2.filter2D(img, -1, kernel1 @ kernel1.T)
gsn2 = cv2.filter2D(img, -1, kernel10 @ kernel10.T)
plt.figure(figsize=(30, 40))

plt.subplot(311)
plt.imshow(img)
plt.title('Original')
plt.xticks([])
plt.yticks([])

plt.subplot(312)
plt.imshow(gsn1)
plt.title(r'Gaussian with $\sigma$ = 1')
plt.xticks([])
plt.yticks([])

plt.subplot(313)
plt.imshow(gsn2)
plt.title(r'Gaussian with $\sigma$ = 10')
plt.xticks([])
plt.yticks([])

plt.show()

Median filtering

Median filtering is a non-linear filtering method that is sometimes more effective than averaging methods.

In the report, include some examples of median filtering on salt-pepper noisy images with different kernels.

In [51]:
img = cv2.cvtColor(cv2.imread('images/LA1.jpg'), cv2.COLOR_BGR2RGB)  # convert BGR to RGB for matplotlib
med = cv2.medianBlur(img, ksize=7)
plt.figure(figsize=(30, 40))

plt.subplot(211)
plt.imshow(img)
plt.title('Original')
plt.xticks([])
plt.yticks([])

plt.subplot(212)
plt.imshow(med)
plt.title('Median filter')
plt.xticks([])
plt.yticks([])

plt.show()

X. Checkpoint

You should now be familiar with OpenCV tools and basic computer vision. In the next two parts, you'll perform more complex operations on the image signal/data.

Part II: Feature extraction

In this part, you should implement methods to detect various features.

1. Edge detection

Edge detection is one of the most important feature-extraction methods in image processing. Edges carry important information about the texture and object descriptions of an image. Research derivative-based methods such as Sobel, Laplacian, etc. Compare them, with examples, in the final report.

https://docs.opencv.org/master/d5/d0f/tutorial_py_gradients.html

In [ ]:
 

2. Hough transforms

Hough transforms are useful tools for line/circle detection. Write a method in your class that detects Hough lines, and another one for circles.

https://www.geeksforgeeks.org/line-detection-python-opencv-houghline-method/

In [ ]:
 

3. Video processing

The cell below is a simple example of blue-color segmentation.

Write these methods, then write a method that can apply the implemented methods to a video stream.

I. ColorSegmentator(image, min_color, max_color)

This method should mark the pixels whose color lies between min_color and max_color.

II. LinesDetector(image, minlength)

This method should detect lines with a minimum length of minlength.

III. PolygonDetector(image, maxside)

This method should detect polygons with at most maxside sides.

Note that all of the methods above should be methods of the ImageSignal class and should also be applicable to a streaming video signal.

In [67]:
cap = cv2.VideoCapture(0)

while(True):
    _, frame = cap.read()
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
   
    # note: OpenCV hue values range over 0-179, so blue sits near 100-130
    lower_blue = np.array([100, 50, 50])
    upper_blue = np.array([130, 255, 255])
    
    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    res = cv2.bitwise_and(frame,frame, mask= mask)
    
    cv2.imshow('frame',frame)
    cv2.imshow('mask',mask)
    cv2.imshow('res',res)
    k = cv2.waitKey(5) & 0xFF
    if k == 27:  # Escape key
        break

cap.release()
cv2.destroyAllWindows()

X. Checkpoint

Now you've seen some examples of working with OpenCV. Most of this material is covered in class.

Part III: Segmentation

To detect QR codes and boxes, removing noise or texture would be sufficient, but we want to do something that makes our work easier.

In [ ]:
 

1. Image decomposition

Consider that we have two main targets: QR codes and box shapes. If we detect the image region containing the QR code, we can apply geometric transformations to it in order to read the QR code's contents.

Example of segmentation

In fact, segmentation can be very complex, but our task is not full semantic segmentation; we only need to segment two kinds of features: QR regions and the rectangles of boxes.

So, implement a method in your ImageSignal class that decomposes the image into its different layers based on the segmentation.

Suggested methods:

Watershed algorithm

https://docs.opencv.org/master/d3/db4/tutorial_py_watershed.html

Probabilistic similarity graph

For this method, build a similarity graph between adjacent pixels and then use graph-traversal methods such as Dijkstra's algorithm to perform the segmentation. This is somewhat harder than basic thresholding methods.
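As an illustration of the idea (not a full Dijkstra implementation), the sketch below grows a region by BFS over 4-neighbour edges whose intensity similarity exceeds a threshold; the similarity definition and threshold are assumptions made for this demo, and running Dijkstra over (1 − similarity) costs is the natural next step:

```python
import numpy as np
from collections import deque

def grow_region(gray, seed, min_similarity=0.9):
    """Collect pixels reachable from `seed` through high-similarity edges."""
    h, w = gray.shape
    visited = np.zeros((h, w), dtype=bool)
    visited[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not visited[ny, nx]:
                # similarity in [0, 1]: 1 for identical intensities
                sim = 1.0 - abs(float(gray[y, x]) - float(gray[ny, nx])) / 255.0
                if sim >= min_similarity:
                    visited[ny, nx] = True
                    queue.append((ny, nx))
    return visited

# Two flat regions separated by a sharp intensity jump.
gray = np.zeros((10, 10), dtype=np.uint8)
gray[:, 5:] = 200
region = grow_region(gray, seed=(0, 0))
print(region.sum())  # 50 -> only the left half is reached
```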

2. Regions extraction

Now you have candidate regions for boxes and QR codes. Write methods in your class that extract the QR code's content and each box's size (in pixels) along with its axis.

The overall output of this part should be the QR code contents, the shapes, and their sizes (in pixels).

3. Deep learning methods

If you have enough time, implement a deep semantic-segmentation method with U-Net.

U-NET pytorch code is available in this link:

https://towardsdatascience.com/u-net-b229b32b4a71

This is what you should do: apply this network to the images in order to read the QR code.

In [ ]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F


class UNet(nn.Module):
    def __init__(self):
        super(UNet, self).__init__()
        # TODO: define the encoder/decoder layers
        pass

    def forward(self, img):
        # TODO: implement the forward pass
        pass

# TODO For QR detection